Progressive privacy-preserving batch retrieval of lung CT image sequences based on edge-cloud collaborative computation

Background A computed tomography image (CI) sequence can be regarded as time-series data composed of a large number of adjacent, similar CIs. Because the computational and I/O costs of similarity measurement, encryption, and decryption during similarity retrieval over large CI sequences (CISs) are extremely high, deploying all retrieval tasks in the cloud places an excessive computing load on it and degrades retrieval performance.
Methodologies To tackle the above challenges, the paper proposes a progressive privacy-preserving Batch Retrieval scheme for lung CISs based on edge-cloud collaborative computation, called the BRS method. Four supporting techniques enable the BRS method: 1) batch similarity measure for CISs, 2) CIB-based privacy-preserving scheme, 3) uniform edge-cloud index framework, and 4) edge buffering.
Results The experimental results show that our method outperforms state-of-the-art approaches in terms of efficiency and scalability, drastically reducing response time by lowering network communication costs while enhancing retrieval safety and accuracy.


Introduction
With the rapid growth in the number of medical images and the increasing demand for remote diagnosis, content-based mobile retrieval of computed tomography image sequences (CISs) in telemedicine systems (TSs) [1] has come to play an increasingly important role in disease diagnosis. Mobile-terminal-based high-resolution CIS retrieval enables medical professionals to identify lesion tissues in aberrant CIs and to carry out computer-assisted diagnosis and treatment. The motivations for CIS retrieval in the edge-cloud collaborative computing mode rest on the following key observations:
• Traditional CI retrieval uses a single CI, rather than the whole CIS, as the query for similarity comparison. A single CI models the whole query CIS inadequately, which leads to poor retrieval precision;
• As CIS data is patients' private information [2], it should be encrypted during retrieval processing; otherwise, personal information may be leaked;
• To better understand the condition of their patients during remote consultations, doctors frequently retrieve and examine CISs in real time, which involves high computational costs as well as intensive transmission of the CISs. Deploying and executing all expensive retrieval and computing tasks in the cloud therefore incurs significant computational overhead and degrades retrieval performance. Edge computing [3] emerged to reduce the load on the cloud. As a new distributed computing mode, it makes up for some shortcomings of cloud computing by diverting most computing tasks to edge device nodes (i.e., edge servers (ESs)) near the mobile terminal. This not only significantly reduces the computing load on the cloud but also reduces the transmission cost [4,5], supporting retrieval in real time [6];
• Mobile terminals are constrained in resources such as battery reserves, screen resolution, and computational power. In addition, data transmission is negatively affected by unstable network bandwidth, which delays data retrieval and transmission, especially in rural areas with inadequate mobile communication infrastructure [1].
Based on the above analysis, the paper presents a privacy-preserving Batch Retrieval method for large lung CISs in the edge-cloud computing network, called the BRS, which exploits the similarity of nearby CIs in a sequence. Few studies have addressed how to speed up batch similarity retrieval of large CISs in an edge-cloud collaborative computing environment. Specifically, when a user submits a retrieval CIS ($X_R$), an index mechanism at the edge layer, called the eIndex, is first used to quickly judge whether the edge buffer holds answer CISs similar to $X_R$. If so, a high-dimensional similarity retrieval of the partial answer CISs, supported by the cIndex, is carried out by accessing the cloud; otherwise, the similarity retrieval of all CISs, supported by the cIndex, is performed directly in the cloud. Finally, the answer CISs are returned to the user node. Extensive experiments demonstrate the effectiveness, efficiency, and scalability of the BRS method.

Background
Over the past fifty years, content-based image retrieval (CBIR) has been a persistent and difficult research problem [6][7][8][9][10]. Due to the 'semantic gap', however, retrieval accuracy is still not satisfactory.
As one of the key subfields of CBIR, content-based medical image retrieval (CBMIR) research has become increasingly popular in recent years. The first CBMIR system built for high-resolution lung CIs is ASSERT [11]. After that, many prototype systems were developed, including IRMA [12], FIRE [13], and others. A noisy image bag-based technique for retrieving medical images was presented by Huang et al. [14]. To further reduce the 'semantic gap', Huang et al. [15] developed a relevance feedback technique for CBMIR based on a noisy-smoothing model. Kitanovski et al. [16] designed a multi-modality-based CBMIR system. Lan et al. [17] proposed a simple texture feature extraction algorithm for CBMIR. A multi-panel medical image segmentation framework for the CBMIR system was supplied by Ali et al. [18]. Based on the fusion of wavelet optimization and adaptive block truncation coding, Kasban et al. [19] built a reliable CBMIR system. Tuyet et al. [20] used deep learning techniques to support salient region-based CBMIR.
Since the aforementioned CBMIR systems run in single-PC mode, their retrieval performance is unsatisfactory when dealing with a large number of medical images [21]. Anbarasi et al. [22] developed a distributed CBMIR system using distributed database techniques. Charisi et al. [23] designed a parallel CBMIR scheme in a peer-to-peer (P2P) network. Based on hybrid features, Depeursinge et al. [24] proposed a mobile access approach to peer-reviewed medical information. Although Zhuang et al. [25] put forward an efficient and robust CBMIR technique for mobile wireless networks, its retrieval efficiency is poor because the load balancing strategy needs further improvement. Building on [25], Zhuang et al. [26] introduced a high-performance batch retrieval technique for medical images in wireless networks that further improves retrieval efficiency through multi-retrieval optimization. A mobile teleradiology system [27] is appropriate for streamlining the CBMIR procedure. For telemedicine applications, Chitra et al. [28] suggested an enhanced retrieval approach for brain images utilizing a carrier-frequency-offset-adjusted OFDM technique. To address the 'semantic gap', Jiang et al. [29] introduced a novel framework for mobile similarity retrieval of medical images based on a crowdsourcing model.
On the basis of CI analysis, Lei et al. [30] developed a sparse CNN model-based high-resolution CI retrieval technique. Yu et al. [31] presented a liver CI retrieval algorithm based on a non-tensor product wavelet. Based on an adder combining two local bit plane-based dissimilarities, Hatibaruah et al. [32] introduced a novel CI retrieval approach. Hwang et al. [33] applied CBIR and CNN techniques to enable diffuse interstitial lung disease retrieval. To facilitate the effective diagnosis of lung cancer, Alzubi et al. [34] designed a boosted neural network ensemble classification approach.
Despite extensive study of CI retrieval, the majority of approaches still operate on single CIs and do not take CIS retrieval into account. Moreover, very little research has addressed CIS retrieval in the collaborative edge-cloud environment.

Preliminaries and preprocessing
First, the main symbol notations are listed in Table 1. Fig 2 depicts the three-layer network architecture of the BRS system, which is formally stated in Definition 1.
ii) $N_E$ represents the edge nodes, which are used for: 1) temporarily storing the CISs buffered at $N_E$; and 2) sending back the partial answer CISs to $N_U$;
iii) $N_C$ represents the cloud nodes, which are used for: 1) partition processing of the IBs; 2) encryption and storage of the CIBs and storage of the NIB replicas; and 3) sending back the answer CISs to $N_U$;
• $E$ denotes a collection of edges representing the different network bandwidths for data communication at time $T$, formally denoted as $E = \langle e_1, e_2, \ldots, e_{|E|} \rangle$, where $e_k = (N_i, N_j)$ refers to the k-th edge in $G$, which connects $N_i$ and $N_j$.
Definition 2 (POA). A pathological object area (POA) in a CI can be modeled by a two-tuple:

As indicated in the Introduction, the CISs usually contain lesion tissues on which doctors focus. The region of such a lesion in the CIS is called the pathological object area (POA). In the preprocessing step, the POAs are first marked by medical specialists; each CI in the sequence is then equally divided into IB (i.e., NIB and CIB) replicas, with the CIBs encrypted and saved at their original pixel resolution while the NIBs are stored at a lower resolution. As illustrated in Fig 3, an example CI contains two POAs (A and B) and one NPOA (i.e., C) and is segmented into 6 × 8 IBs marked by red dashed lines.

Methodologies
In this section, we first introduce four supporting techniques and then propose the BRS algorithm built on them.

Supporting techniques
To better facilitate the batch retrieval of the lung CISs in the MECN, in this subsection, we introduce four supporting techniques: 1) batch similarity measure for CISs, 2) CIB-based privacy preserving scheme, 3) uniform edge-cloud index framework, and 4) edge buffering.
Batch similarity measure for CISs. As mentioned before, a CIS $X_i$ is time-series data that can be modeled by a vector: $X_i = \{CI_1, CI_2, \ldots, CI_{|X_i|}\}$, where $|X_i|$ denotes the number of CIs in $X_i$. Because a CIS contains a large number of CIs, we propose a representative CI (RCI)-based batch similarity measure for CISs to reduce the high computation cost of CIS similarity matching.
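The idea behind an RCI-based measure can be sketched in a few lines. This is an illustration under stated assumptions, not Algorithm 1 or Eq (4): each CI is reduced to a plain feature vector, `d` is taken to be Euclidean distance, a CI becomes a new RCI whenever it lies farther than a small threshold `eps` from the most recent RCI, and two CISs are scored by the fraction of their RCIs that have a counterpart within `eps` in the other sequence. All function names are hypothetical.

```python
import math

def d(x, y):
    """Euclidean distance between two feature vectors (assumed stand-in for d(x, y))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def extract_rcis(cis, eps):
    """Greedy pass: keep a CI as a new RCI when it is farther than eps from the last RCI."""
    rcis = []
    for ci in cis:
        if not rcis or d(ci, rcis[-1]) > eps:
            rcis.append(ci)
    return rcis

def batch_similarity(rcis_m, rcis_n, eps):
    """Fraction of RCIs that have a counterpart within eps in the other sequence."""
    if not rcis_m or not rcis_n:
        return 0.0
    hit_m = sum(1 for a in rcis_m if any(d(a, b) <= eps for b in rcis_n))
    hit_n = sum(1 for b in rcis_n if any(d(a, b) <= eps for a in rcis_m))
    return (hit_m + hit_n) / (len(rcis_m) + len(rcis_n))

# Two toy sequences of 2-D feature vectors.
x_m = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (2.0, 2.0)]
x_n = [(0.0, 0.05), (1.0, 1.1), (5.0, 5.0)]
r_m = extract_rcis(x_m, eps=0.5)   # near-duplicate neighbours collapse into one RCI
r_n = extract_rcis(x_n, eps=0.5)
score = batch_similarity(r_m, r_n, eps=0.5)
```

Comparing only RCIs keeps the pairwise cost proportional to the number of representatives rather than the number of CIs, which is the saving the measure is after.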
Before the batch similarity measure can be introduced, the RCIs must be extracted, which is itself a challenging issue. As summarized in Algorithm 1, given a CIS $X_i$, RCI extraction is first performed to obtain $\|X_i\|$ RCIs from $X_i$, where $\|X_i\|$ denotes the number of RCIs in $X_i$, $d(x, y)$ is stated in Table 1, and $\varepsilon$ is a small positive threshold. Once the $\|X_i\|$ RCIs are extracted from $X_i$, $X_i$ can be re-represented as: $X_i = \{RCI_1, RCI_2, \ldots, RCI_{\|X_i\|}\}$. So given two CISs ($X_m$ and $X_n$), their batch similarity can be defined as follows:

As can be seen from Eq (4), the similarity of two CISs can be measured by the percentage of similar RCIs in the two corresponding CISs.

CIB-based privacy-preserving scheme. Before introducing the CIB-based privacy-preserving scheme, we first give a definition.
Definition 6 (POAR). Given a POA (i.e., $POA_j$), its corresponding POA-related region (POAR) consists of all CIBs in $POA_j$, subject to the following criteria:

where $POAR(POA_j)$ denotes the corresponding POAR of $POA_j$, and $|\cdot|$ denotes the number of CIBs in $\cdot$.
In Fig 3, there are two POAs (i.e., A and B) in the CI, which is equally segmented into 6 × 8 IBs. Based on Definition 6, the corresponding POARs of the two POAs are represented by the green shaded areas, which consist of 20 CIBs. Because the ID numbers of nearby CIBs are distributed continuously, they make it easier to reconstruct the original CI from these CIBs. The objective of the encryption strategy is therefore to disrupt the continuity of the ID numbers of nearby CIBs in the CI by encoding them, so that CI reconstruction is hard to perform.
So for each CIB in a CI, we first introduce an encoding scheme (IBID) for its ID number, which is represented in Eq (6):

where SID means the ID of the CIS that the CIB belongs to, IID refers to the ID of the CI in which the CIB is contained, rID is the row ID, cID is the column ID, and $c_1$, $c_2$, $c_3$ are stretch constants with $c_1 \gg c_2 \gg c_3$. Based on the ID numbers of the CIBs in Eq (6), their encryption and decryption strategies are described as follows: 1) Encryption strategy: Algorithm 2 details the steps of the CIB-based encryption processing in which the ID numbers of the CIBs are encrypted, where δ and ω are two key values with δ < SID, δ < IID and ω < rID.
Algorithm 2 Encryption()
input: SID, IID, rID, cID of a CIB
output: IBID: the encrypted ID number of the CIB
1: if rID is an odd number and cID is an odd number then
2:
3: else if rID is an odd number and cID is an even number then
4:
5: else if rID is an even number and cID is an odd number then
6:

For instance, assume that SID is 7, IID is 4, and $c_1$, $c_2$, $c_3$ are 1000, 100 and 10, respectively; then the original ID numbers of the CIBs before encryption are depicted in Fig 4(a). Fig 4(b) shows the encrypted ID numbers of the CIBs when δ = 3 and ω = 0.6. Fig 4(a) shows the continuous distribution of the ID numbers of nearby CIBs before encryption. After encryption, as illustrated in Fig 4(b), the ID numbers of nearby CIBs are distributed discretely. The encryption of the CIBs therefore makes it harder to find the corresponding nearby CIBs during CI reconstruction.
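The continuity that motivates the encryption can be reproduced with a small sketch. Eq (6) is not reproduced here, so the positional form below (`SID*c1 + IID*c2 + rID*c3 + cID`) is an assumption chosen only to be consistent with the stretch constants $c_1 \gg c_2 \gg c_3$; the function name `encode_ibid` is hypothetical, and the encryption step of Algorithm 2 is deliberately not modelled.

```python
def encode_ibid(sid, iid, rid, cid, c1=1000, c2=100, c3=10):
    """Assumed positional encoding of a CIB's ID (not necessarily the paper's Eq (6))."""
    return sid * c1 + iid * c2 + rid * c3 + cid

# Unencrypted IDs for a 2 x 3 patch of CIBs with SID = 7 and IID = 4.
ids = [[encode_ibid(7, 4, r, c) for c in range(1, 4)] for r in range(1, 3)]

# Horizontal neighbours differ by 1 and vertical neighbours by c3,
# so the spatial layout can be read directly off the raw IDs.
row_steps = [ids[0][k + 1] - ids[0][k] for k in range(2)]
col_step = ids[1][0] - ids[0][0]
```

Under this assumed encoding, anyone holding the raw IDs can sort them to recover block adjacency, which is the property the key-based encryption is meant to break.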
Next, we analyze the probability of successful decryption (i.e., the probability of accurate image reconstruction). Given a CI with m POAs, for each POA (i.e., $POA_i$), the rows and columns of its corresponding POAR are denoted as $RS_i$ and $CS_i$, respectively. Then, the probability that decryption succeeds can be derived in Eq (7):

Based on Eq (7), as the number of CIBs in a CI increases, the probability of successful decryption becomes smaller and smaller, which guarantees the hardness of decryption at a theoretical level. The encrypted CIBs are stored in $N_C$ or $N_E$, so the IDs of the CIBs of a CI stored there follow a discrete rather than continuous distribution. The reconstruction and display of the CIs are conducted at $N_U$ by decrypting the ID numbers of the CIBs with the key values (i.e., δ and ω).

Uniform edge-cloud indexing framework. To support faster CIS filtering, we propose a uniform edge-cloud index framework (UECIF) based on iDistance [35]. The UECIF is composed of two types of indexes: the eIndex in $N_E$ and the cIndex in $N_C$.
• Index Construction For the eIndex, initially, suppose that the CISs in Ω are virtually stored in $N_E$; that is, the CISs in Ω are physically stored in $N_E$ but are not yet logically buffered in $N_E$. The CISs are first grouped into K clusters by the AP-cluster [36] based on visual similarity (i.e., Eq (4)). Given a CIS $X_i$, its index key can be represented as:

$$key(X_i) = j \times c_1 + sim(X_i, X_r^j) \qquad (8)$$

where $X_r^j$ is the cluster centre of the j-th cluster that $X_i$ belongs to, $sim(\cdot, \cdot)$ represents the visual similarity distance function (i.e., Eq (4)), $j \in [1, K]$, and the constant $c_1$ is used to stretch the value range.
The index key is inserted into an improved B+-Tree in which a leaf node (LNode) can be modeled by a triplet: LNode($X_i$) = <key, value, EType>, where EType = 'F' means $X_i$ is not buffered in $N_E$; otherwise, $X_i$ has been buffered in $N_E$. Algorithm 4 summarizes the initial construction of the eIndex, in which LNode($X_i$).EType ← 'F' (line 6) means that all of the CISs are virtually stored in $N_E$.

Algorithm 4 eIndex construction(Ω)
input: Ω: the CIS set
output: eIdx
1: eIdx ← NULL; ▷ initialize
2: the CISs in Ω are grouped into K clusters; ▷ at edge node
3: for each CIS ($X_i$) in Ω do
4: insert key($X_i$) into an improved B+-tree (i.e., eIdx);
6: LNode($X_i$).EType ← 'F';
7: end for
8: return the eIdx;

Similarly, for the cIndex, the CISs (Ω) in $N_C$ are first clustered into T clusters based on the above visual similarity. For a CIS $X_i$, its index key is derived as in Eq (9), where $j \in [1, T]$ and the other parameters and symbols are the same as those in Eq (8). In Algorithm 5, the index key is inserted into an improved B+-Tree in which a leaf node (LNode) can be represented by a triplet: LNode($X_i$) = <key, value, CType>, where LNode($X_i$).CType = 'T' means $X_i$ is stored in $N_C$.
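A minimal sketch of this kind of index construction is given below. It assumes the usual iDistance convention for the key (cluster number times a stretch constant, plus the distance to the cluster centre) and substitutes a sorted list with binary search for the improved B+-Tree; `SortedIndex`, `index_key`, and the value of `C1` are hypothetical names and choices, not part of the paper.

```python
import bisect

C1 = 1000.0  # stretch constant (assumed value; must exceed any cluster radius)

def index_key(cluster_id, dist_to_centre, c1=C1):
    """iDistance-style key: cluster offset plus distance to the cluster centre."""
    return cluster_id * c1 + dist_to_centre

class SortedIndex:
    """Sorted list standing in for the improved B+-tree of the eIndex."""

    def __init__(self):
        self._keys = []
        self._nodes = []

    def insert(self, key, value, etype="F"):
        # Every leaf starts with EType = 'F', i.e. not yet buffered at the edge.
        pos = bisect.bisect_left(self._keys, key)
        self._keys.insert(pos, key)
        self._nodes.insert(pos, {"key": key, "value": value, "EType": etype})

    def range(self, lo, hi):
        """Return all leaf nodes whose key lies in the closed interval [lo, hi]."""
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        return self._nodes[i:j]

e_idx = SortedIndex()
e_idx.insert(index_key(1, 0.25), "X1")
e_idx.insert(index_key(2, 0.10), "X2")
e_idx.insert(index_key(1, 0.75), "X3")
hits = [n["value"] for n in e_idx.range(1000.0, 1001.0)]  # everything in cluster 1
```

Because keys of different clusters occupy disjoint intervals, a one-dimensional range scan over the keys is enough to restrict a search to one cluster.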

• Index-Support Retrieval Processing
Based on Eqs (8) and (9), suppose that there are n CISs in Ω; their index keys are inserted into the improved B+-Trees of the eIndex and the cIndex, respectively. For a range retrieval $\Theta(X_R, r_R)$ and each cluster $C_j$, as illustrated in Fig 5, there are five cases in terms of the relative positions of the two spheres.
Case 1: in Fig 5(a), the inequalities $sim(X_r^j, X_R) - r_R < r_r^j$ and $sim(X_r^j, X_R) < r_R$ are met, which means that $\Theta(X_R, r_R)$ intersects $\Theta(X_r^j, r_r^j)$, by which $X_R$ is contained. So the search range is represented as: $[j \times c_1,\ j \times c_1 + d(X_r^j, X_R) + r_r^j]$;
Case 2: in Fig 5(b), the inequalities $sim(X_r^j, X_R) - r_R \le r_r^j$ and $sim(X_r^j, X_R) \ge r_R$ are met, which means that $\Theta(X_R, r_R)$ intersects $\Theta(X_r^j, r_r^j)$ and $\Theta(X_r^j, r_r^j)$ does not contain $X_R$. So the search range is represented as: $[j \times c_1 + sim(X_r^j, X_R) - r_R,\ j \times c_1 + r_r^j]$;
Case 3: in Fig 5(c), the inequality $sim(X_r^j, X_R) + r_r^j \le r_R$ is met, which means that $\Theta(X_R, r_R)$ contains $\Theta(X_r^j, r_r^j)$. So the search range is represented as: $[j \times c_1,\ j \times c_1 + r_r^j]$;
Case 4: in Fig 5(d), the inequality $sim(X_r^j, X_R) + r_R \le r_r^j$ is met, which means that $\Theta(X_r^j, r_r^j)$ contains $\Theta(X_R, r_R)$. So the search range is represented as: $[j \times c_1,\ j \times c_1 + r_r^j]$;
Case 5: in Fig 5(e), the inequality $sim(X_r^j, X_R) > r_r^j + r_R$ is met, which means that $\Theta(X_R, r_R)$ does not intersect $\Theta(X_r^j, r_r^j)$. No candidate sequences are retrieved.
For the similarity retrieval in the MECN, there are two cases in terms of whether a partial answer exists in $N_E$: 1) the complete answer sequences ($\Psi$) are obtained directly from $N_C$ based on the cIndex (see Algorithm 8); 2) the complete answer sequences ($\Psi$) are composed of the partial answer sequences ($\Psi'$) obtained from $N_E$ (see Algorithm 6) and the partial answer sequences ($\Psi''$) obtained from $N_C$ (see Algorithm 7).
Algorithm 6 details the similarity range retrieval of the CISs based on the eIndex in $N_E$. The routine Search() implements the range similarity search in the improved B+-Tree and is described in Algorithm 9.
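The five cases for a single cluster can be written as one small function. The sketch below follows the case conditions and key ranges as listed, with two assumptions of its own: the cases are tested in an order that resolves their overlaps (disjoint first, then the two containment cases, then the two intersection cases), and the name `search_range` and its parameters are hypothetical.

```python
def search_range(j, c1, dist, r_q, r_c):
    """Key interval to scan in cluster j, or None if the cluster can be skipped.

    dist: distance between the cluster centre and the query CIS
    r_q:  retrieval radius of the query sphere
    r_c:  radius of the cluster sphere
    """
    base = j * c1
    if dist > r_c + r_q:                  # Case 5: spheres do not intersect
        return None
    if dist + r_c <= r_q:                 # Case 3: query sphere contains the cluster
        return (base, base + r_c)
    if dist + r_q <= r_c:                 # Case 4: cluster contains the query sphere
        return (base, base + r_c)
    if dist < r_q:                        # Case 1: partial overlap, centre within r_q
        return (base, base + dist + r_c)
    return (base + dist - r_q, base + r_c)  # Case 2: partial overlap, centre outside

ranges = {
    "case1": search_range(2, 100, dist=2, r_q=3, r_c=3),
    "case2": search_range(2, 100, dist=4, r_q=3, r_c=3),
    "case3": search_range(2, 100, dist=1, r_q=5, r_c=2),
    "case4": search_range(2, 100, dist=1, r_q=1, r_c=5),
    "case5": search_range(2, 100, dist=10, r_q=1, r_c=3),
}
```

Only Case 2 trims the lower end of the interval; the other intersecting cases scan the cluster from its first key, so the refinement step that follows still has to verify each candidate against $r_R$.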
Algorithm 6 ESearch($X_R$, $r_R$, $\Omega'$)
input: $\Theta(X_R, r_R)$: the retrieval CIS, $\Omega'$: the CISs whose ETypes are 'T' in $N_E$
output: $\Psi'$: the partial answer CISs from $N_E$
1: $\Psi'$ ← Φ; ▷ initialization
2: $\Psi'$ ← Search($X_R$, $r_R$, $\Omega'$)
3: for each candidate CIS ($X_i$) ∈ $\Psi'$ do
4: if sim($X_R$, $X_i$) > $r_R$ then
5: $\Psi'$ ← $\Psi'$ − $X_i$;
6: else
7: if LNode($X_i$).EType = 'F' then
8: LNode($X_i$).EType ← 'T'; ▷ for eIndex
9: LNode($X_i$).CType ← 'F'; ▷ for cIndex
10: update the information of $X_i$ (e.g., access frequencies and access time) in the log file;
11: end if
12: end if
13: end for
14: return $\Psi'$

Similarly, Algorithm 7 summarizes the index-supported partial similarity range retrieval of the CISs at the cloud node level. It is worth mentioning that LNode($X_i$).CType = 'T' means $X_i$ is not buffered in $N_E$.

Algorithm 7 CSearch($X_R$, $r_R$, $\Omega'$)
input: $\Theta(X_R, r_R)$: the retrieval CIS, $\Omega'$: the CISs whose ETypes are 'T' in $N_C$
output: $\Psi'$: the partial answer CISs from the cloud node
7: if LNode($X_i$).EType = 'F' then
8: LNode($X_i$).EType ← 'T'; ▷ for eIndex
9: LNode($X_i$).CType ← 'F'; ▷ for cIndex
10:

if sim($X_R$, $X_i$) − $r_R$ < $r_r^j$ and sim($X_r^j$, $X_R$) < $r_R$ then
4: else if sim($X_R$, $X_i$) − $r_R$ ≤ $r_r^j$ and sim($X_r^j$, $X_R$) ≥ $r_R$ then
6: else if sim($X_r^j$, $X_R$) + $r_r^j$ ≤ $r_R$ then
8: else if sim($X_r^j$, $X_R$) + $r_R$ ≤ $r_r^j$ then
10:

• Index Update When a user submits a retrieval request, the eIndex needs to be updated by adding the CISs that have been accessed in this retrieval. Since the number of CISs that can be buffered in $N_E$ is limited, choosing which CISs to buffer is challenging.
For example, assume that there are six CISs in $N_E$. Tables 2 and 3 illustrate the ranking of access time (AT) and access frequency (AF) for the six CISs, respectively. In Table 2, the ATs of the six CISs are sorted in ascending order and quantitatively represented by the AT_IDs, from which a weighted AT (WAT) is derived in Eq (10). Similarly, for the ranking of the access frequencies (AF), a weighted AF (WAF) is given in Eq (11). Based on Eqs (10) and (11), the uniform ranking score (URS) of a given $CIS_j$ is then obtained.
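The flavour of such a ranking can be illustrated with a small sketch. The weighting used in Eqs (10) and (11) and the URS formula are not reproduced here, so everything below is an assumption: ranks are normalised by the number of buffered CISs, the two normalised ranks are mixed with a hypothetical weight `alpha`, and the lowest-scoring CISs are the ones evicted when the buffer is full.

```python
def rank_ids(values):
    """Map each CIS name to its 1-based position when values are sorted ascending."""
    order = sorted(values, key=values.get)
    return {name: pos for pos, name in enumerate(order, start=1)}

def uniform_ranking_scores(access_time, access_freq, alpha=0.5):
    """Assumed URS: weighted mix of normalised recency rank and frequency rank."""
    n = len(access_time)
    at_rank = rank_ids(access_time)   # later access time -> higher rank
    af_rank = rank_ids(access_freq)   # more accesses -> higher rank
    return {
        name: alpha * at_rank[name] / n + (1 - alpha) * af_rank[name] / n
        for name in access_time
    }

def evict_candidates(scores, capacity):
    """Names to drop so that only the `capacity` highest-scoring CISs stay buffered."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[capacity:]

access_time = {"X1": 100, "X2": 400, "X3": 250}   # larger = more recent
access_freq = {"X1": 9, "X2": 7, "X3": 1}
scores = uniform_ranking_scores(access_time, access_freq)
dropped = evict_candidates(scores, capacity=2)
```

Combining recency and frequency in one score avoids evicting a CIS that is accessed often but happened not to be touched by the latest retrieval.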

Edge buffering
Unlike the traditional image retrieval methods, which directly obtain data from the remote cloud, if the answer CISs can be directly obtained from the edge without accessing the cloud, it will greatly shorten the long-distance transmission delay and improve the retrieval efficiency. Based on the above motivation, we propose an edge buffering scheme by analyzing the user historical retrieval (HR) log file. The refinement cost of the candidate CISs can be significantly decreased with the help of the buffering scheme since a portion of answer CISs can be retrieved directly without any refinement processing.
Specifically, assume that n HRs have been successfully completed with accurate results. Because the answer CISs provided by each HR have been verified, when a user submits a new retrieval CIS (i.e., $X_R$), it is quite possible that $X_R$ is similar to, or even the same as, a historical one (i.e., $X_R^{\prime i}$). As a result, retrieval efficiency and accuracy can be greatly improved if the HR results in $N_E$ are carefully reused as part of the current results.
Definition 7 (CRS). Given a retrieval CIS $X_R$ and a retrieval radius $r_R$, their corresponding CIS retrieval sphere (CRS) is a high-dimensional sphere with centre $X_R$ and radius $r_R$, denoted as $\Theta(X_R, r_R)$.
Definition 8 (HCRS). Given an HR CIS $X_R^{\prime i}$ and a retrieval radius $r_R^{\prime i}$, their corresponding historical CIS retrieval sphere (HCRS) is a high-dimensional sphere with centre $X_R^{\prime i}$ and radius $r_R^{\prime i}$, denoted as $\Theta(X_R^{\prime i}, r_R^{\prime i})$.
Definition 9 (AA). Given a CRS $\Theta(X_R, r_R)$ and an HCRS $\Theta(X_R^{\prime i}, r_R^{\prime i})$, their corresponding affected area (AA) is the intersection of the two spheres, formally denoted as: $AA(\Theta(X_R, r_R), \Theta(X_R^{\prime i}, r_R^{\prime i}))$.
For example, as shown in Fig 6, there are three HCRSs, i.e., $\Theta(X_R^{\prime 1}, r_R^{\prime 1})$, $\Theta(X_R^{\prime 2}, r_R^{\prime 2})$ and $\Theta(X_R^{\prime 3}, r_R^{\prime 3})$. The current CRS is represented as $\Theta(X_R, r_R)$. For $X_R$, its 1st, 2nd and 3rd nearest neighbor CISs are $X_R^{\prime 1}$, $X_R^{\prime 3}$ and $X_R^{\prime 2}$, respectively. The HR of $\Theta(X_R^{\prime 2}, r_R^{\prime 2})$ can be safely discarded since its HCRS does not intersect $\Theta(X_R, r_R)$. The CISs falling in the AAs (i.e., $\Theta(X_R, r_R) \cap \Theta(X_R^{\prime 1}, r_R^{\prime 1})$ and $\Theta(X_R, r_R) \cap \Theta(X_R^{\prime 3}, r_R^{\prime 3})$) can be part of the answer CISs of $\Theta(X_R, r_R)$.
Next, given two CIS retrieval spheres, $\Theta(X_R, r_R)$ and $\Theta(X_R^{\prime i}, r_R^{\prime i})$, there are two cases depending on the two retrieval CISs (i.e., $X_R$ and $X_R^{\prime i}$), which are shown in Figs 7 and 8, respectively. In Fig 7, if $X_R = X_R^{\prime i}$, there are two cases in terms of the retrieval radii (i.e., $r_R$ and $r_R^{\prime i}$).
• For case (a), which is formally represented as $AA(\Theta(X_R, r_R), \Theta(X_R^{\prime i}, r_R^{\prime i})) = \Theta(X_R^{\prime i}, r_R^{\prime i})$, since the CISs falling in the HCRS $\Theta(X_R^{\prime i}, r_R^{\prime i})$ have already undergone verification, they can be part of the answer CISs for $\Theta(X_R, r_R)$;
• For case (b), which is formally represented as $AA(\Theta(X_R, r_R), \Theta(X_R^{\prime i}, r_R^{\prime i})) = \Theta(X_R, r_R)$, the answer CISs for $\Theta(X_R, r_R)$ can be derived from the CISs in $\Theta(X_R^{\prime i}, r_R^{\prime i})$.
In Fig 8, if $X_R \ne X_R^{\prime i}$, there are four cases according to the placement of the two spheres (i.e., $\Theta(X_R, r_R)$ and $\Theta(X_R^{\prime i}, r_R^{\prime i})$).
• In case (a), the AA of the two spheres does not exist, formally represented as $AA(\Theta(X_R, r_R), \Theta(X_R^{\prime i}, r_R^{\prime i})) = NULL$, so the answer CISs need to be calculated sequentially in $\Theta(X_R, r_R)$;
• In case (b), the AA of the two spheres exists, formally represented as $AA(\Theta(X_R, r_R), \Theta(X_R^{\prime i}, r_R^{\prime i})) \ne NULL$. Since the CISs falling in $\Theta(X_R^{\prime i}, r_R^{\prime i})$ have been verified previously, they can be regarded as part of a candidate CIS set of $\Theta(X_R, r_R)$;
• In case (c), the AA of the two spheres is $\Theta(X_R^{\prime i}, r_R^{\prime i})$, formally represented as $AA(\Theta(X_R, r_R), \Theta(X_R^{\prime i}, r_R^{\prime i})) = \Theta(X_R^{\prime i}, r_R^{\prime i})$. As the CISs that fall in $\Theta(X_R^{\prime i}, r_R^{\prime i})$ have been verified previously, they can be regarded as part of an answer CIS set of $\Theta(X_R, r_R)$;
• In case (d), the AA of the two spheres is $\Theta(X_R, r_R)$, formally represented as $AA(\Theta(X_R, r_R), \Theta(X_R^{\prime i}, r_R^{\prime i})) = \Theta(X_R, r_R)$, so the answer CISs are contained in the CISs falling in $\Theta(X_R^{\prime i}, r_R^{\prime i})$.
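The placement cases above reduce to comparisons between the centre distance and the two radii. The sketch below classifies a current retrieval sphere against one historical sphere; the geometric conditions follow from the triangle inequality, while the function name, the label strings, and the tie-breaking when the spheres coincide are assumptions made for illustration.

```python
def affected_area(dist, r_cur, r_hist):
    """Classify how a historical retrieval sphere relates to the current one.

    dist:   distance between the current and the historical retrieval CIS
    r_cur:  radius of the current retrieval sphere
    r_hist: radius of the historical retrieval sphere
    """
    if dist > r_cur + r_hist:
        return "disjoint"                 # no AA: nothing can be reused
    if dist + r_hist <= r_cur:
        return "history_inside_current"   # AA is the HCRS: reuse its answers directly
    if dist + r_cur <= r_hist:
        return "current_inside_history"   # AA is the CRS: filter the HR answers only
    return "partial_overlap"              # AA is a lens: HR answers are candidates

relations = [
    affected_area(10, 2, 3),
    affected_area(4, 3, 3),
    affected_area(1, 5, 2),
    affected_area(1, 2, 5),
    affected_area(0, 3, 5),   # same retrieval CIS, smaller current radius
]
```

With `dist = 0` the function also covers the situation in which the current and historical retrieval CISs coincide and only the radii differ.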

The BRS algorithm
Before introducing the algorithm, a pre-processing step is first conducted. Algorithm 11 summarizes the detailed steps of our proposed BRS method, in which ESearch($X_R$, $r_R$), CSearch($X_R$, $r_R$) and DSearch($X_R$, $r_R$) correspond to Algorithms 6-8, respectively. As illustrated in Fig 9, when a retrieval lung CIS ($X_R$) is submitted from the user node level $N_U$ to the edge node level $N_E$, the eIndex scheme in $N_E$ is first adopted to quickly judge whether there are answer CISs similar to $X_R$. If so, the high-dimensional similarity retrieval is carried out with the support of the cIndex scheme at the cloud to obtain the partial answer CISs; otherwise, the similarity retrieval of all CISs supported by the cIndex is performed directly in the cloud. Finally, the answer CISs are returned to the receiver node. Note that, in line 9, before the answer CISs are transmitted to the receiver, the CIBs in the CIs need to be decrypted to ensure the accurate reconstruction and display of the answer CISs. Compared to NIBs, the CIBs have higher transmission priorities. The IBs are transmitted in descending order of priority, which not only assures the security of data transmission but also ensures that the critical information is transmitted first.

Algorithm 11 BRS($X_R$, $r_R$)
input: $X_R$: a retrieval CIS, $r_R$: a retrieval radius
output: $\Psi$: the answer CISs
2: $\Psi'$ ← ESearch($X_R$, $r_R$); ▷ obtain answer CISs ($\Psi'$) based on the eIndex at the edge node $N_E$;
3: if $\Psi' \ne$ NULL then
4: $\Psi''$ ← CSearch($X_R$, $r_R$); ▷ obtain the partial answer CISs ($\Psi''$) based on cIndex at the cloud $N_C$;
5: $\Psi$ ← $\Psi' \cup \Psi''$;
6: else
7: $\Psi$ ← DSearch($X_R$, $r_R$); ▷ obtain the complete answer CISs based on cIndex at the cloud $N_C$;
8: end if
9: transmit the CISs in $\Psi$ to the receiver node level with different transmission priorities
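The control flow of Algorithm 11 can be sketched as follows. The three search routines are passed in as plain callables returning sets of CIS identifiers, so the sketch says nothing about how ESearch, CSearch, or DSearch work internally; the function names `brs` and `transmission_order` and the dictionary layout of an image block are hypothetical.

```python
def brs(x_r, r_r, e_search, c_search, d_search):
    """Edge-first retrieval: combine edge and cloud partial answers, else query the cloud only."""
    partial_edge = e_search(x_r, r_r)          # partial answers found via the eIndex
    if partial_edge:
        answers = partial_edge | c_search(x_r, r_r)   # add the cloud's partial answers
    else:
        answers = d_search(x_r, r_r)           # complete answers straight from the cloud
    return answers

def transmission_order(blocks):
    """Send CIBs before NIBs; the sort is stable, so order within each class is kept."""
    return sorted(blocks, key=lambda b: 0 if b["kind"] == "CIB" else 1)

# Stub search routines standing in for the real index lookups.
full_cloud = lambda x, r: {"X3", "X8", "X9"}
edge_hit = brs("XR", 0.5, lambda x, r: {"X3"}, lambda x, r: {"X8"}, full_cloud)
edge_miss = brs("XR", 0.5, lambda x, r: set(), lambda x, r: {"X8"}, full_cloud)

blocks = [{"id": 1, "kind": "NIB"}, {"id": 2, "kind": "CIB"}, {"id": 3, "kind": "NIB"}]
send_ids = [b["id"] for b in transmission_order(blocks)]
```

Passing the search routines as arguments keeps the branching logic testable without an index or a network.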

Experiments
To verify the efficiency of the proposed BRS method, extensive simulation experiments are conducted to demonstrate the retrieval performance.
Experimental setup. In the experiments, the image receiver client is equipped with a 5.9-inch, full HD 1080p screen and a Qualcomm® Snapdragon™ 650 processor running at 1.7 GHz. The client system is developed in Java and runs on the Android operating system [37]. The edge node and the cloud node are connected via 1 Gbps network links. In the cloud node, the IBs (i.e., CIBs and NIBs) with various transmission priorities are kept in a file system, and some structured data is recorded in MySQL [38]. Each node contains a 2.7 GHz quad-core Xeon processor, 2.0 GB of memory, and a 1.0 TB hard disk. The maximum data communication rate in the wireless network is 150 Mbps.
We selected the LUNA16 dataset [39], which contains 239,232 lung CIs, as our experimental dataset. The database contains 888 lung CISs, with an average of 336 lung CIs per sequence; the number of CIs in a sequence ranges from 200 to 600.

A prototype retrieval system. Fig 10 depicts a demonstration of the prototype system. An example of the CIS pre-processing backend interface is shown in Fig 10(a), in which a POA has been marked by a blue rectangle. In Fig 10(b), a CIS with the category 'lung' has been input as a retrieval sequence. Four result CISs were quickly retrieved, and their matching IBs are restored and shown.
Effectiveness of the BRS method. The first experiment tests the effectiveness of our BRS method using randomly selected lung CISs as experimental data. The recall and precision achieved by this retrieval method can be defined as:

$$recall = \frac{|rel \cap ret|}{|rel|}, \qquad precision = \frac{|rel \cap ret|}{|ret|}$$

where rel denotes the ground-truth set and ret refers to the set of results returned by a similarity range search. Fig 11 compares the retrieval effectiveness for 10 CISs of the same organ (i.e., lung) randomly selected from the database. As can be observed from the figure, precision steadily declines as the recall ratio rises. The reason is that when recall is low, only a few results are returned and most of them are likely to be correct, whereas achieving high recall requires returning more results, which does not guarantee that all of them are correct.
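Using the standard set-based definitions of the two measures, precision and recall for a single range search can be computed as in the sketch below; the function name and the CIS identifiers are hypothetical.

```python
def precision_recall(rel, ret):
    """Precision and recall of a returned set `ret` against the ground truth `rel`."""
    rel, ret = set(rel), set(ret)
    hit = len(rel & ret)                      # correctly returned CISs
    precision = hit / len(ret) if ret else 0.0
    recall = hit / len(rel) if rel else 0.0
    return precision, recall

# Four relevant CISs; the search returns three, two of which are relevant.
p, r = precision_recall(rel={"X1", "X2", "X3", "X4"}, ret={"X1", "X2", "X9"})
```

Enlarging the retrieval radius typically adds both relevant and irrelevant CISs to `ret`, which is why recall can rise while precision falls.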
Effect of data size. In this experiment, we investigate the effect of data size (i.e., the number of lung CISs) on retrieval efficiency using two methods: 1) our proposed BRS method; and 2) the MIRC method in [25]. The network bandwidth is 100 Mbps, the number of edge nodes is 15, and the UECIF is used. In Fig 12, as the number of CISs increases, the BRS method remains superior to the MIRC since the edge buffer significantly reduces the retrieval computation cost and transmission delay. Meanwhile, it is interesting to observe that as the data size increases, the overall response time first grows rapidly and then more gradually. This is because the index performs better when there is more data.
Effect of ε. This experiment evaluates the effect of ε on retrieval performance. As in the previous experiment, the network bandwidth and the number of cloud (edge) nodes are fixed, and the edge buffering scheme and the indexing mechanism are adopted. As demonstrated in Fig 13(a), as ε increases, the CPU cost of the similarity computation decreases because the number of RCIs in each CIS decreases. Meanwhile, it is interesting to note in Fig 13(b) that the precision ratio first increases rapidly and then decreases gradually. The reason is that too many or too few RCIs make it difficult to measure the similarity of the CISs accurately and completely. Therefore, ε is set to 0.65.

Effect of edge buffering scheme. In this experiment, we study the effect of the edge buffering scheme on retrieval performance. Method 1 adopts the edge buffering scheme and method 2 does not. Fig 14 demonstrates that the overall response time of method 1 is shorter than that of method 2 when the bandwidth is stable and the retrieval radius ($r_R$) is fixed. Meanwhile, the performance gap widens as $r_R$ steadily grows while the bandwidth remains constant. This is because, as $r_R$ increases, the probability of obtaining result CISs from the edge buffer also increases.
Effect of indexing scheme. The final experiment examines how the index framework (i.e., eIndex and cIndex) affects retrieval efficiency. Here, method 1 uses the two indexes, whereas method 2 does not (i.e., it sequentially searches each cloud node to find the answer CISs). In Fig 15, with the data size and the network bandwidth fixed and the number of cloud nodes varying from 10 to 50, the response time of method 1 (i.e., index-based retrieval) grows as the number of cloud nodes increases. Meanwhile, the performance gap between the two approaches narrows, since the response time of method 2 is relatively stable; locating the candidate CISs with the index is nevertheless faster than without it. Interestingly, the retrieval response time is smallest when the number of cloud nodes is 10: the more cloud nodes are involved in retrieval, the more data exchange and transmission occur, which delays retrieval.

Conclusion
In this paper, we introduced the BRS method, a privacy-preserving batch retrieval scheme for lung CISs in an edge-cloud collaborative computing environment. The goal of the BRS method is to provide safe and efficient retrieval of lung CISs in resource-constrained networks with low and unstable bandwidth. To enable efficient BRS processing, four supporting techniques are proposed, namely, 1) batch similarity measure for CISs, 2) CIB-based privacy-preserving scheme, 3) uniform edge-cloud index framework, and 4) edge buffering scheme. The experimental results reveal that, with the aid of the supporting techniques, the efficiency of the BRS method is more than 200% higher than that of sequential retrieval, especially when the number of cloud nodes is smaller.